perm filename CMP.SJU[P,JRA] blob sn#155752 filedate 1975-04-22 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00003 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	                               STRINGS
C00007 00003	General notes.
C00009 ENDMK
C⊗;
                               STRINGS
Many  high-level languages  include operations  on  objects known  as
strings.   We define a  string to be a  linear sequence (of arbitrary
length) consisting of uppercase letters  or digits.  Thus AB or  A123
or 01A23  are strings.  We will blur  the distinction  between single
characters and the string consisting of the single character.  Thus A
is a string.  To give completeness, we include the empty string, ε. 

We wish to implement strings as a structure in a language and wish to
include a rich complement of operations on such strings. 
First we wish to be able to assign strings to string variables.  Thus
x ← ABC is to  mean that the variable, x, has been  assigned a string
ABC. 
         Other operations are:

concat[x;y]:  x and  y are  strings;  this operation  has as  value a
    representation  of  the   concatenation  of  x  with  y.     Thus
    concat[ABC;DE] gives ABCDE.  concat[ε;x] = concat[x;ε] = x. 

first[x]: x is  a non-empty string; first[x] gives  the first element
    of x.  Thus first[ABC] gives A. 

rest[x]:  x is a  non-empty string;  rest[x] gives all  but the first
    element of x.  Thus rest[ABC] = BC; rest[A]= ε, the empty string;
    and rest[ε] is undefined. 

I.   Implementation. We will  have an area  of memory  (String Space)
which  is best  visualized  as a  sequence of  cells.   Each  cell is
capable of containing one character. 

String variables  will be  accessed through  a symbol  table.   For a
non-empty  string variable,a  symbol table  entry  will consist  of a
pointer into string  space and a character-count.   The pointer  will
point  to  the first  character  of  the  designated string  and  the
character count  will determine the number of characters which are to
be included in the string.  Thus:

   x:  ( 3  ↓ )            means x has value DEF.
            →→→→→→→→→↓
                     ↓
     ...    A  B  C  D  E  F .....     string space

Describe implementations  of the operations concat,  first, and rest.
In describing concat, be  sure to take into  account the problems  of
storage management for String Space.  Things which might flash before
your  eyes might be: garbage  collection; reference counters; sharing
of values; copying of values. 

 II. Other operations: How would you implement a function to give the
length of  a string?  What  is the test  for the empty string?   What
about  equality of strings?  (hint: are  strings stored uniquely?  if
so, why; if not, why not)

subst[x;y;z] means "substitute the string x for  the first occurrence
(in left to right order)of the string y in the string z". Thus
subst[AB;  01;   AC0101]  gives  ACAB01.  Analyze   the  problems  of
implementing subst. For example, does the current representation help
or hurt.  Do you modify the original value of z, or what? 
General notes.
 I would suggest that you read each problem through carefully before
you begin any writing.  The problems are not meant to be tricky, but
care is required. 











                         POSTFIX EVALUATION
We wish to  evaluate expressions given in a  postfix parenthesis-free
notation.   Assume for this part of  the problem that the expressions
contain only the binary operators for plus and times and contain only
numeric constants.   You may further assume  that the expressions are
well formed. 

Thus anomalies like "23+*" or "234+" will not occur. 

I.  Describe a representation for such expressions and give a precise
  algorithm for their evaluation. 


II. Assume that we wish to extend this algorithm to handle 
   i. ill-formed input (therefore to produce error messages)
  ii. operators other than plus and times,
 iii. operators other than binary ones.

Describe  appropriate modifications  to  your algorithm  and/or  data
representation to handle these three new cases.